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Probing The Cosmological Constant Through The 
Alcock-Paczynski Test Based on The Lyman-Alpha Forest 

Wen-Ching Lin 1,2,3 and Michael L. Norman 1,2 

ABSTRACT 

In recent years, the possibility of measuring the cosmological constant Ha 
through the application of the Alcock-Paczynski test to the Lyman Alpha (Lya) 
forest has been suggested (McDonald et al. 1999; Hui et ah 1999). Despite the 
theoretical uncertainties due to a few other cosmological parameters, some of the 
greatest difficulties we encounter concern the huge uncertainties due to cosmic 
variance and noise. In this paper, we propose a maximum likelihood estimation 
(MLE) method to deal with cosmic variance and noise using synthetic spectra 
of quasistellar objects (QSOs) from our cosmological hydrodynamic simulations. 
We demonstrate that the MLE method can overcome the cosmic variance prob¬ 
lem. Applying the MLE method, we find that we have more than 90% probability 
to determine Ha within 20% error and approximately of 66% probability to deter¬ 
mine Ha within 10% error by using 30 pairs QSO spectra when other cosmological 
parameters are assumed. Another important source of error is from noise in the 
flux spectra, and we have modeled the corresponding effect by studying artificial 
spectra with different kinds of noise added. We discover that the noise distri¬ 
bution does not have significant effect on the final cross-correlation functions as 
long as the signal-to-noise ratio (S/N) is fixed. Finally, a preliminary test and 
discussion about the sensitivities to other cosmological parameters are included 
in this paper as well. 

Subject headings: Lya forest, cosmological constant, AP test 


1. Introduction 

An important topic in cosmology is the determination of the energy densities of the 
various components of the Universe. Therefore constraining the values of their respective 
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fractional energy densities: baryon density (fig), matter density (12 m) and vacuum density 
(12a) becomes a vital part of understanding the Universe. First year WMAP results gives 
12 0 =1.02±0.02 (Bennett et al. 2003). When combined with galaxy clustering data, the 
Lyman a forest, and other CMB measurements, WMAP finds 12 s h 2 =0.0224±0.0009 and 
12 m = 0.268±0.0159 (Spergel et ah 2003). 

The results 12 0 ~ 1 and 12 M « 0.27 suggest the existence of a dark, exotic form of energy, 
which is smoothly distributed and contributes roughly 70% of the critical density (Turner 
et al, 1983; Peebles, 1984). It is summarized 12 a = |12m + |±| based on observations of 
the Type la supernovae (Perlmutter et al. 1999; Goobar 2000; Riess et al. 1998; Schmidt 
et al. 1998), which is consistent with studies based on different physics (Holder et al. 2000; 
Guerra et al. 2000). Since all these results are based on data at z < 2 (mostly < 1), other 
independent measurements of 12 a from data at an earlier epoch becomes important. This 
motivates us to implement the Alcock-Paczynski Test (AP test) on the Lyman Alpha (Lya) 
forest. 

In 1979, Alcock and Paczynski proposed a method which can be used to measure the 
geometry of the LIniverse (Alcock & Paczynski, 1979). The basic idea of this method is that 
for a spherical object in the sky, its physical size along the line of sight and perpendicular to 
the line of sight should be equal. This method can be extended to non-spherical cosmological 
structures, in which case the characteristic length of the correlation function of the structure 
will be used instead of the physical size of the object. The two-point correlation functions 
of galaxies and clusters have been suggested as candidates for the AP test (Ryden 1995; 
Ballinger et al. 1996; Matsubara & Suto 1996; Popowski et al. 1998). Recently the cross¬ 
correlation function of Lya forest clouds has been proposed as a good candidate for the AP 
test as well (McDonald et al. 1999; Hui et al. 1999; McDonald 2003). 

The geometrical basis of the AP test suggests the measurement of the value where 
and Ad relate to the scales of the object due to the Hubble flow expansion parallel and 
perpendicular to the line of sight, respectively. Observationally, the characteristic lengths 
parallel and perpendicular to the line of sight can be written in terms of the velocity sep¬ 
arations nil and v±. When Az is small, the velocity parallel to the line of sight is given by 


Ar>i 


Az 

TTz 


c = A v h + Av p 


( 1 ) 


where Ary is the velocity separation due to the Hubble flow expansion and Av p denotes 
the effect of the peculiar velocity. The transverse velocity separation is : 
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Av ± = H(z)Al = H(z)D a (z)A9 (2) 

where D A {z) is the redshift-dependent angular diameter distance, H(z) is the expansion rate 
and A l is the physical size of the object we are interested in. We can write Av± in a form 
expressing its relationship with cosmological parameters explicitly : Au_l = c f(z) A 9, where 


f( z ) = ~D a (z)H(z) 


E(z) f z dz' 

1 + z J 0 E(z') 


(3) 


with E(z) = [D m (l + z) 3 + Da( 1 + z) 3 ( 1+a; )] 2 where we specialize to the case with an 
equation of state u> = & — -1. 

Because the matter distribution expands with the Hubble flow at the same rate in all 
directions in a homogeneous and isotropic universe, the formation of structures should be 
equal in all directions statistically. Therefore, the auto-correlation function along the line 
of sight £(A V||) and the auto-correlation function perpendicular to the line of sight £(A 
u_i_) should be the same (see §4.2 for detailed definitions of correlation functions). Unfortu¬ 
nately, one can not observe the auto-correlation function perpendicular to the line of sight. 
However, it is suggested that one can overcome this problem by analytically deriving a cross¬ 
correlation function £ x (A v \\) = £ x ( a/A n 2 — An 2 ) of two parallel lines of sight based on 
the information from the auto-correlation function. Then by comparing with the observa¬ 
tional cross-correlation function, we can determine the cosmological model (McDonald et al. 
1999; Hui et al. 1999). Based on this idea, instead of analytically deriving cross-correlation 
functions, we use fully hydrodynamical simulations to provide more accurate results. The 
benefit of this approach is that the nonlinear effects (e.g., peculiar velocities, shock heating) 
are automatically included. We also able to generate many numerical paired QSO spectra 
for the purpose of statistical analysis and the comparison with real observational data. 

Figure (1) shows the flux cross-correlation functions in different cosmological models as 
taken from our simulations. As is evident, the flux cross-correlation function is a powerful 
descriminant between models. In the specific structures we are interested in, the Lya forest 
at redshift 2, the velocity separations of 100 - 600 krn/s correspond to comoving scales of 
1-6 Mpc, which is the characteristic size of voids lying between the absorbing clouds. 
Within the current paradign (e.g., Zhang et al. 1998), Lya forest absorbers are mildly 
overdense regions of gas that form a network of sheets and filaments that are nearly fixed in 
comoving coordinates. This makes them perfect candidates for the AP test. Complications 
due to nonlinear processes (shock heating, peculiar velocities) are taken into account in our 
analysis since they are included in the underlying simulations. 
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This paper is the first of several describing our methodology for applying the AP test 
to the Lyman alpha forest. Here we focus on the uncertainties of estimating Ha due to 
cosmic variance and noise in the quasar spectra assuming all other parameters are known. 
We describe a method, based on maximum liklihood estimation (MLE), that overcomes 
these problems given a sufficient number of quasar pairs. We show that uncertainties due to 
the cosmological parameters fib and erg are small, while the uncertainties due to the poorly 
constrained intensity of the UV background are of the same order as uncertainties clue to 
Ha- This latter issue is currently under investigation by us and will be reported on in a 
forthcoming paper. 

This paper is organized as follows : The cosmological simulations and the analysis codes 
are discussed in §2 and §3. Then we display our methodology in §4. Finally, we provide 
our results and concluding remarks in §5 and §6. Additional information about maximum 
likelihood estimation and error propagation are given in Appendix (A) and Appendix (B). 


2. Cosmological Simulations 
2.1. Cosmological Code 

We have performed several simulations of the z = 2 Lya forest in different cosmological 
models. All simulations were performed using our cosmological hydrodynamics code Enzo. 
Enzo incorporates a Lagrangean particle-mesh (PM) algorithm to follow the collisionless 
dark matter and a higher-order accurate piecewise parabolic method (PPM) to solve the 
equations of gas dynamics. In addition to the usual ingredients of baryonic and dark matter, 
Enzo also solves a coupled system of non-equilibrium ionization equations with radiative 
cooling for a gas with primordial abundances. Our chemical reaction network includes six 
species: HI, HII, Hel, Hell, Helll and (Abel et al. 1997; Anninos et al. 1997). The 
simulation starts with the initial perturbations originating from inflation-inspired adiabatic 
fluctuations. The BBKS (Bardeen et al. 1986) transfer function is employed with the stan¬ 
dard Harrison-Zel’dovich power spectrum. Another important component in the simulations 
is an ultraviolet (UV) radiation background which ionizes the neutral intergalactic medium. 
Haardt & Madau (1996) have provided a UV radiation field with a radiation transfer model 
in a clumpy universe based upon the observed quasar luminosity function. Enzo starts to 
import their homogeneous UV background spectra at redshift 7 and increases the inten¬ 
sity of the spectra at redshift 6 to generate photoionization and photoheating rates in our 
simulations. 
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2.2. Lya Forest Simulations 

In this work we performed twelve Lya: forest simulations using Enzo. All simulations 
were done on the Origin 2000 supercomputer at the National Center for Supercomputing 
Applications (NCSA). We have six major simulations for our main analysis work and their 
comoving box size are 37.3 Mpc (25.73 Mpc/h, h = 0.69) with 256 3 dark matter particles 
and a 256 3 grid for the evolution of gas dynamics. The other cosmological parameters used 
here are erg = 0.73, f 1b = 0.04, 12 0 = 1.0, h = 0.69 and the power spectrum index n = 1.0. 
The only parameter varied over the six simulations is 12 a, which is given values of 0.0, 0.5, 
0.6, 0.7, 0.8 and 0.9. In addition to these, we have another six simulations with 12 a = 0.7 
where we varied other cosmological parameters: <t 8 , photoionization parameter J and 12# (or 
h) for the purpose of testing parameter sensitivities, (see Table (1) and § 5.1 for a summary). 


3. Artificial QSO spectra 
3.1. Spectrum Generator 

In order to compare our simulation results to observations, we need to produce realistic 
artificial flux spectra from our simulations. The spectrum generator we used here starts at 
the point with the lowest neutral hydrogen density inside the box, shooting photons along 
random lines of sight through the box. Theoretically, one can start at any point in the box 
and it is just a matter of choice to choose the point with the lowest Nhi density. Here 
we denote this spectrum generating method as SGA. Another spectrum generating method 
often used is to shoot lines of sight parallel to the edge of the box, and we denote this 
method as SGB. One important advantage of SGA is that it is actually closer to the real 
observational case. This is because when we observe the Universe, the observation always 
starts at a single point, Earth, and collects spectra from different lines of sight. Therefore, 
if this kind of observation does introduce statistical errors, we want to reproduce the same 
bias in the numerical simulations. The method SGA also avoids the dependence of nearby 
lines of sight in the method SGB. The method SGA has been used and carefully studied by 
Zhang et al. (1998), Bryan et al. (1999), and Machacek et al. (2001). 

The method SGA calculates the transmitted flux of a QSO at redshift z as e -1 " 1 ', with 
the optical depth t v given by 


/ to 

n H i(t)cr u cdt (4) 

where c is the speed of light, rifjj is the number density of the HI absorbers, cr v is the 



6 


absorption cross-section, t is the corresponding cosmic time at redshift z and to is the cosmic 
time today. Integration is performed along the line of sight from the QSO to the observer. 
This can be written in a form more suitable for computation (Zhang et.al. 1997) as 


T„{Z) = 4*- f 

> s/TTUo J 2 


z ° n Hl{z) a 2 


exp 


(! + *)£ 


1 + 7 


§2 > dz 


where z is the redshift, a Q is the resonant Ly-a cross section, v D is the Lyo; rest frequency, 
v is the peculiar velocity along the line-of-sight and v is the redshifted frequency, b is the 
effect of Doppler broadening on the absorption cross section and is equal to \j2kT/m p , where 
k is the Boltzmann’s constant, T is the gas temperature and rn p is the mass of a proton. 
This equation, parametrized to order v/c, also needs the scale factor a to be specified, which 
is given by the Friedman equation, 

a = H 0 \J 1 + — 1) + Da (a 2 — 1) 

In order to generate paired QSO spectra, we first calculated the comoving separation d 
of two QSOs at the desired redshift based on the known angular separation under a given 
cosmological model. Then from the point with the lower neutral hydrogen density, point A, 
and a given random direction r, the plane S : (x - A) ■ f — 0 is uniquely determined. Then 
on the circle with center A and radius d/2 on S, we chose two points C and D where C-A-D 
lay on a line. Then we generated two parallel QSO spectra from C and D both along the 
direction f. Figure (2) shows a pair of simulated QSO spectra with a resolution identical to 
the Low Resolution Imaging Spectrometer (LRIS) on the Keck telescope. 


3.2. Spectra Degrading and Signal to Noise Properties 


In order to compare with the observational data, we degrade our ideal spectra to the res¬ 
olution of the desired instrument. The resolution of each ideal QSO spectrum was degraded 
by convolving the entire spectrum with a normalized Gaussian function : 

1 


f(x) = 


<7 


exp[ 


—x 


y/2ir 2a 2 


(5) 


with a = ^t==£ , where FWHM is the full width half maximum of the spectral resolution 
of the desired instrument. The convolution subroutine used was based on FFT algorithms 
from Numerical Recipes (Press et ah,1988). Figure (3) shows the degraded simulated spectra 
at different resolutions. 


We also examined the effect of different S/N in our simulated spectra based on Gaussian- 
distributed noise. The noise comes from a Gaussian distribution with zero mean and unit 
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variance: 

p(y)dy = -7= exp[-y 2 /2}dy (6) 

V Z7T 

and we denote the noise as N gauss . Then with a given S/N value, we derived the final flux 
spectra by adding the corresponding noise to the original flux spectra: 


ffinal forig T 


N 

1v qauss 

~sJW 


(7) 


where ffi na i is the final flux with noise and f orig the flux after degrading the spectrum. Then 
we define the overflux Sf as 7x1 ? where / is the mean flux of the spectrum at a given redshift 
interval ( 1.754 < z < 1.954 in this paper) and is calculated by averaging the whole data 
points in the spectrum. So we have : Sf(zi), Sf(z 2 ), ■ ■ ■ 6f(z n ) from a QSO spectrum, where 
Zi, i — 1,2..., n are the corresponding redshifts at each data point. Then we calculate the 
corresponding flux power spectrum of these data points using the FFTW algorithm (Frigo 
et al. 1998). Here k is defined as • Figure (4) shows the flux power spectrum. Adding 

noise increases the amplitude of the flux power spectrum, which means introducing small 
scale power. Beyond k = 0.01, the noise dominates the amplitude of the power spectrum. 
This tells us that the data beyond k = 0.01 is not reliable, due to the noise effect. 


Because the distribution of noise in the observation is unknown, we chose several com¬ 
mon distributions of noise to add to our simulated spectra for the analysis work. We com¬ 
pared the results based on Gaussian-distributed noise with other noise distributions including 
the Poisson and Gamma distributions. We used the routines in Numerical Recipes (Press 
et ah, 1988) to generate noise from the above distributions. For a Poisson distribution, the 
total probability of integer j (event j) is : 


ri+ e 


•3p~x 


Prob(j) = / P x (m)dm = 


x J e 


'3-t 




( 8 ) 


The Poisson noise, N po i SSOn is a random deviate drawn from the Poisson distribution with 
unit mean. Similarly, a Gamma distribution of integer order a > 0 is the waiting time to 
the a t h event in a Poisson random process of unit mean. We know that a Gamma deviate 
has a probability P a (x)dx of occurring with a value between x and x + dx, where 


P a (x)dx = 


1 g X 




dx , x > 0 


(9) 


where T(x) is the gamma function. Then the noise is a deviate distributed as a gamma 
distribution of integer order 1, which is the waiting time to the first event in a Poisson 
process of unit mean. Figure (5) shows the flux cross-correlation functions at S/N = 10 
with different noise distributions. We conclude that for a given S/N ratio, the distribution 
of noise does not have a significant influence on the final cross-correlation function. 



4. Methodology and Numerical Procedures 


4.1. Cosmological Simulations and Simulated Spectra 

In this work, we used data from six simulations with different values of 12 a as discussed 
in §2.2 (Simulations Set A in Table (1)). Before more QSO pairs are available from Sloan 
Digital Sky Survey (SDSS), approximately a few dozen QSO pairs with good quality should 
be observed through Keck (Kirkman et al. 2002). To provide an useful statistical study, 
we have to generate several groups of simulated pairs where each group has approximately 
the same number of pairs as are available observationally. We also know that pairs with 
angular separations between 1’ - 3’ contain good information about geometry in their cross- 
correlation functions (McDonald et al. 1999). Therefore, we generated 300 paired QSO 
spectra with an angular separation of 120” using the spectrum generator code described in 
§3.1. The length of a spectrum is of A z = 0.2, which is around 7 ~ 8 times the box size. 
Since we shoot lights in random directions, each line of sight will not go through the same 
point in the box. 

Then we degraded the ideal QSO spectra to a FWHM of 300 km/s and a pixel size 
of 130 km/s, which is comparable to the LRIS of the Keck telescope. Gaussian-distributed 
noise with S/N = 10 and 20 were also added to the degraded simulated LRIS spectra for 
our analysis. These values were chosen comparable to the observational data which will be 
available in the near future. 


4.2. Cross-Correlation Functions 

Given a pair of QSO spectra, we first calculated the overflux at each point of a spectrum. 
The overflux <5/ and the mean flux / are defined in § 3.2. However instead of considering 
one QSO spectrum as we did in § 3.2, we now have a pair of QSO spectra, so the mean flux 
/ is averaged over the whole data points of the paired spectra. Therefore we have : 


fifi(zi), 5f\(z2), • • • 5fi(zr,) from the first QSO spectrum 
(5/2Qi), < 5 / 2 ( 22 ), • • • <5/ 2 ( 2 n ) from the second QSO spectrum 

where 2 /, i = 1,2, ... n, are the corresponding redshifts at each point of the QSO 
spectra. Thus for a point <5 /i( 2/) from the first spectrum and another point <5/2(2/) from the 
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second spectrum, their parallel velocity separation can be expressed as: 

An,, = 1 (10) 

" 1 + z y 1 

where | z 3 — z , | are small and z = Z1 + Zn . The cross-correlation at a given parallel velocity 
separation Any is denoted by £(Auy) and is calculated by : 

f(Au||) =< 5f 1 (z i )S f2 (z j ) > (11) 

averaging over all possible permutations of (i.j) which satisfy Equation (10). 

The correlations we are concerned about here are velocity separations of 600 krn/s or 
less, which is less than 10 Mpc and is definitely much smaller than | box size. Therefore, 
even though the lines of sight wrapped the box a few times, the range of the correlation 
function we are interested in is small enough and should not be affected by our box size. 

Figure (7) shows the cross-correlation functions of two QSO pairs from the 17 a = 0.7 
simulation. The large difference of the cross-correlation functions between the two pairs 
reveals the significant impact of cosmic variance, which is the reason why we introduced the 
MLE method (see Appendix (A) and §4.4 for more details). 


4.3. Probability Density Functions 

In order to apply the maximum likelihood method (MLE) to our analysis work, we need 
to provide accurate probability density functions first (Appendix (A)). For each of the 6 
simulations, we have generated 300 paired QSO spectra with an angular separation of 120”, 
providing 300 cross-correlation functions per simulation: £(A vy ; A 9 | 17a), where A 9 — 
120” and 17a = 0.0, 0.5, 0.6, 0.7, 0.8 and 0.9. We denote £(A vy ; A 9 | 17a) f° r given A 9 
and 17 a at a fixed A as £(Au,p. At a given A vj each cross-correlation function provides 
one value of £(Auip, so we have 300 data points of £(Auip at Auj) under given 17 a and A 
9. We use these data to construct a probability density function of £(Avp at AujL denoted 
by PDF(^(Auy r )), for known 17 a and A 6. We repeat the procedures for different A vy and 
17 a- Figures (8), (9) and (10) show the PDF(£(Ar?||)) at three different Au^ for different 
cosmological models. 


4.4. The Likelihood Calculation 

Given a QSO pair, pair A, we first calculate the corresponding cross-correlation function 
£a(Au||) based on Equation (11). We denote the points we use in this work as (An,|, ^(Auy)), 



(A-Um, £ a (Avm)), . (Avj|", ^(Au™)). In our work, we use five points: (Au,j = 129 km/s, 

^a(Au||)) , (Any = 259 km/s, £ A (Aujp) , (Au| = 388 km/s, £ A (Aujj)), (Au| = 518 km/s, 
^(Aujf)) and (Au| = 647 km/s, £ a (Au|)). For each given 0 A , using the PDF(Auy | 0 A ) 
generated in §4.3, we can find that the probability of getting (Auj|, ^(AnS)) is f\. 

Assuming we have more than one paired QSO spectra: Pax, Pa 2 , ••••, Pat, we can 
combine the probability / A j(uj), j = 1, ..., r, where is a given velocity bin for all pairs. 
Then L v . = II^ = , f] Aj , where all (uj, f l Aj ) are independent to each other since P 41 , Pa 2 , • •••, 
Pat are from different parts of the sky. By repeating the above calculation in different 
cosmological models, we can derive a likelihood function L Vi (Vt) at the velocity bin ry. Since 
the data points at different velocity bins do not share the same probability denstity function, 
we can combine these data and derive a final likelihood function : L(f!) = n/ =1 L„.(n) 

Figure (11) shows the likelihood function of a group of 30 pairs at different velocity 
separations. The likelihood functions have similar results at different velocity separations and 
the combination at all velocity separations provides a better statistics and an useful result. 
By maximizing the likelihood function, we can determine ff A . The advantage of using the 
MLE method here is that we can derive ff A without collecting many paired QSO spectra with 
the same angular separation and redshift to determine a reliable cross-correlation function 
beforehand. 

The above analysis ignores the correlations between data points at different velocity 
separations within a given pair spectrum. The fact that the combined liklihood function 
(Fig. 11) is very close to the v=129 km/s liklihood function, and that the other velocity 
separations have similar profiles, suggests that this may indeed be the case. We would like 
to make two points in defense of our method, however. We could just as well base the 
determination of 0 A on a single velocity separation, in which case the issue goes away and 
our MLE method is essentially unchanged. Second, we view combining the different velocity 
bins as a way of making use of all the data that is available. One can look for the bias 
introduced by treating them as independent a posteriori. In Sec. 5.3 we apply our method 
to a blind sample of pair spectra taken from a A = 0.7 simulation. As shown in Fig. 16 and 
discussed in Sec. 5.3, the bias is dominated by sample size. We find that a sample of 30 
pairs with S/N=20 is sufficient to recover the correct value of A within uncertainties. 
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5. Results 

5.1. Sensitivities to other cosmological parameters 

Our preliminary test of several other cosmological parameters includes cr 8l photoion¬ 
ization parameter J, Qg and h. There is a large scatter in the value of <r 8 measured by 
various scientists (Ballinger et al. 1996; Ratra et al. 1997; Efstathion et al. 2001; Holder et 
al. 2001; Wang et al. 2002) and we thereby choose the value of a 8 to be 0.73 with a 10% 
varying range. Figure (12) shows that fluctuations in cross-correlation functions caused by 
the uncertainty of <x 8 is relatively small and ignorable compared with that of 12 a. The ranges 
of Qb and h are first constrained by the relation D#/? 2 ~ 0.019 (Bnrles et al. 1999; Tytler 
1999). At the same time, we pick the value of Hubble Constant with h = 0.72 ± 0.08 based 
on the HST key project (Freedman et al. 2001). Our result indicates that the uncertainties 
in VLb and h do not have significant influence in this work (Figure (13)). 

Another important parameter J indicates the intensity of the photoionization back¬ 
ground. The standard J = 1.0 in onr simulations corresponds to the Haardt & Madau UV 
radiation field (1996). This ionization background has been implemented in many previous 
Lya simulations (eg. Bryan et al. 1999). The uncertainty in J is large and in order to be 
safe, we vary the value of J from 0.25 to 3.0. From Figure (14), we know that the fluctua¬ 
tion in the final cross-correlation function caused by J is approximately at the same order 
as caused by 12 a- Therefore, quantifying the value of J becomes important and a detailed 
study will be done in our next paper. 

We also compare our results with the parameter studies discussed in McDonald (2003). 
In McDonald (2003), he varies five free parameters in the Lya forest model including the 
power spectrum amplitude Ai, the power-law index of the power spectrum at k\ , the factors of 
a power-law temperature-density relations T 14 and 7 -I, and the mean flux F (See McDonald 
2003 for detailed definitions of these parameters). While in our work, we focus on changing 
the cosmological parameters including <r 8 , 12^(11) and the photoionization parameter J to 
study the corresponding results of the output simulated spectra calculated by the Spectrum 
Generator mentioned in §3.1. The two groups approach this issue of parameter sensitivities 
in different ways and both of them show that measuring other parameters accurately is one 
of the most important issues in a true AP test. 
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5.2. Probability density functions with S/N properties 

It is important to understand the effect of noise in our analysis work. We first pay 
attention to the probability density functions drawn from the data sets with S/N = 10 and 
S/N = 20 in Figure (15). This figure shows that the probability density function with a 
high S/N ratio has a smaller standard deviation than that of a lower S/N. This indicates 
the noise spreads out the probability density function of correlations and we should provide 
different observational data points their corresponding probability density function during 
the analysis procedures. 

The instrumental noise gives an error bar at each data point in a flux spectrum, and 
the noise at each data point is proportional to the value of the flux. Figure (6) includes the 
1 cr error bar at each data point for a flux spectrum. The uncertainties in the flux spectrum 
definitely result in the uncertainties in the cross-correlation Auction. Based on the standard 
error propagation formula (Appendix B), we can calculate the corresponding error bars in 
the cross-correlation function as shown in Figure (7) for two randomly chosen pairs. In the 
upper panel of Figure (7), the error bars are derived from the flux spectra with S/N = 10 
while in the lower panel the error bars are calculated based on the S/N = 20 flux spectra. 

We should notice that the large differences in the cross-correlation functions of the two 
pairs in both panels are the result of cosmic variance. In the upper panel, the difference 
of cross-correlation in the two pairs is around the same order but slightly larger than the 
error bars in the cross-correlation drawn from the SN =10 flux spectra. Therefore, for each 
data point in a cross-correlation function the uncertainty due to the noise at each point 
could possiblely yield an uncertainty in the final likelihood function and thereby Ha. Due 
to this, quantifying the uncertainty in our estimation of Ha due to the nosie is an important 
problem. More detailed work will be done in the near future. 

This is because one can never claim the exact value of the correlation while an error 
bar is concerned. Therefore, to quantify the uncertainty in the Ha due to the noise is an 
important problem and more detailed work will be done in the near future. 


5.3. Probability of Obtaining the Correct value of Ha 

To investigate the question about the number of pairs needed for deriving a reliable value 
of Ha, we take 300 paired QSO spectra from our Ha = 0.7 simulation and pretend that they 
are observational data from the real Universe. First, we resample our 300 paired QSO spectra 
into 30 subgroups of 10 pairs each. As described in §4.4, in each subgroup we calculated the 
likelihood based on the data points of the 10 cross-correlation functions and then derived 



13 


the combined-likelihood of the total 10 pairs in different cosmological models. Thus, in each 
subgroup, we derive a likelihood function of Ha. By maximizing the likelihood function (or 
the log (likelihood) function (LLK)), we derive the corresponding Ha (see Appendix (A) for 
details). This gives a distribution for values of Ha based on the 30 subgroups as shown in 
Figure (16). If we parametrize the distribution as a Gaussian, we find that Ha is 0.6 ± 0.145, 
where 0.6 is the mean and 0.145 is the standard deviation of the distribution from the 30 
subgroups. 

Before any further analysis, we want to define the uncertainty in Ha first. In this paper, 
we denote an x % uncertainty in Ha as e x % and it means the derived H^ is between (^fj^p)HA 
and ( 1 ° 1 ( ^ x )Ha. Here we have assumed that the real value of Ha in the Universe is 0.7 and 
thus the 20 % uncertainty of Ha ranges from 0.56 to 0.84 (within e 20 %). Therefore, by 
calculating 


P 




exp 


1 (x — g) 2 
2^2 


( 12 ) 


where g = 0.6 and a = 0.145, we find that the probability of getting Ha within e 2 o% 
is 56 %. Similarly, the ei 0 % for the value of Ha ranges from 0.63 to 0.77. So by replacing 
the lower and the upper integration limits in Equation (12) to 0.63 and 0.77, we derived the 
probability of obtaining Ha within e w % is 30 %. By repeating the above procedures, we also 
conclude that the probability of getting Ha within e 5 % ( 0.665 < Ha < 0.735) is 15 %. 

To investigate the importance of sample size, we also split the 300 paired-spectra into 20 
subgroups with 15 pairs in each subgroup, 15 subgroups with 20 pairs in each subgroup, 12 
subgroups with 25 pairs in each subgroup and 10 subgroups with 30 pairs in each subgroup. 
Then in each case we calculated the distribution of Ha and the probability of deriving Ha 
with an uncertainty of e 20 %, e w % and e 5 %. For the case we have 30 paired QSO spectra in 
each subgroup (10 subgroups), the probability to confine Ha within e 20 % is 94%, within ei 0 % 
is 66 % and within 65 % is 36%. A summary of results is displayed in Table (2). Each row 
in the table indicates one statistical study. The first column is the number of QSO pairs in 
one subgroup while the second column is the number of subgroups in each study. The third 
column denotes the result of Ha- Finally, the fourth, fifth, and the sixth columns are the 
probability of getting Ha within an uncertainty range of e 20 %, e w % and e 5 %. 



6. Discussion and Conclusion 


Even though the possibility of using the AP test on the Lyo forest to measure Da has 
been emphasized (ffui et al. 1999; McDonald et al. 1999), the practical work of dealing 
with real observational data is not easy. There are several major difficulties in this work 
including cosmic variance, uncertainty of other cosmological parameters, understanding of 
spectral resolution along with S/N properties and the continuum fitting. The main focus of 
this paper is to produce noisy low resolution spectra which are comparable to the real hRIS 
observational data and show how to overcome the errors due to noise and cosmic variance. 
Aside from this, a preliminary examination of the parameter sensitivity is also included. 

Based on the study of noisy low resolution spectra generated from our simulation as 
discussed in §3.2, we found that adding artificial noise into the QSO spectra increases the 
amplitude of the flux power spectrum, especially for k > 0.01 . This is because random noise 
and small structures in the flux spectra are indistinguishable. Therefore, we conclude that 
data in k-spaee beyond k = 0.01 is untrustworthy for our purposes. 

In §5.1, we provide three sets of simulations with varying 0 % photoionization parameter 
J, Qb to study parameter sensitivities. We have found that the uncertainty of the photoion¬ 
ization parameter J has a significant influence on the cross-correlation function as well as 
Da- Therefore, to constrain the range of J becomes an important goal. In this paper, instead 
of doing a joint estimation for J and Da, we simply assumed J was known. Estimating J is 
equivalent to estimating the mean absorption in the Lya forest which is sensitive to how one 
fits the continuum level. We are currently extending our analysis to include uncertainties 
due to continuum fitting, and will report on that work in our next paper. 

In order to handle the cosmic variance problem, we introduced the MLE technique in the 
analysis procedures. We then conclude that based on 30 paired QSO spectra with an angular 
separation of 120”, we have 94% confidence to determine Da within 20 % error (e 2 o%), 66 % 
condolence to determine Da within 10 % error (ei 0 %), and 36 % confidence to determine Da 
within 5 % error ( 05 %),. Here we have taken finite spectral resolution and noise into account. 
Our results agree with the results of McDonald (2003) which claims with N pa i r = 13 (d/1 ') 2 
= 52 can measure Da to ± 0.03 or ±0.04. 

The process of gathering paired QSO spectra is time-consuming. Even before enough 
QSO pairs have been gathered to constrain the value of Da, we can achieve a useful result 
by ruling out unlikely models. Under the assumption that we live in a universe with a high 
value for Da, we can rule out the SCDM model at 90 % confidence by utilizing only 10 
pairs of QSO spectra if we assume the the other cosmological parameters are known (Figure 
(16)). The tight constraints placed on the other cosmological parameters by WMAP in 
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combination with galaxy clustering data structure encourage us that this will be feasible in 
the near future. The largest remaining uncertainty is the mean level of absorption in the Lya 
forest. We are pursing this issue presently and will report our findings in the near future. 


We highly appreciate Brian O’Shea, Tridivesh Jena and Pascal Paschos for help editing 
and organizing the article. We also grateful to the useful discussions with Tridevish Jena and 
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Cross-correlation functions of different cosmological models 

based on LRISB resolution spectra with S/N = 10 



Fig. 1.— Flux cross-correlation functions for several cosmological models with different 
12 a- Each curve is the average of 300 cross-correlation functions while each cross-correlation 
function was derived from a paired QSO spectra based on our simulations. The corresponding 
FWHM is 300 km/s, the pixel size is 130 km/s and the signal-to-noise ratio is 10. The angular 
separation is 120”. The lower solid curve is the SCDM model and the higher solid curve is 
the 12 a = 0.9 model. The dotted curve is the 12 a = 0.5 model, the dashed curve is the 12 a = 
0.6 model, the long-dashed curve is the 12 a = 0.7 model and the dotted-dashed curve is the 
12a = 0.8 model. 









A simulated spectrum with different resolutions 

z = 2.0 



redshift 

Fig. 3.— We degraded an ideal flux spectrum from our simulation to different resolutions 
based on desired instruments. HIRES: High Resolution Echcllc Spectrometer of the Keck 
telescope, FWHM = 8 krn/s. ESI: An Echcllette Spectrograph and Imager for the Keck II 
Telescope, FWHM = 55 krn/s. LRIS: Low Resolution Imaging Spectrometer resolution of 
the Keck telescope, FWHM = 115 - 300 km/s (115 km/s in this Figure). Lick: FWHM: 250 
km/s. 
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power spectrum of different S/N properties 










10" 3 10" 2 1 o _1 


k (l/(km/s)) 


Fig. 4.— The averaged flux power spectrum based on 300 simulated QSO spectra. We 
degraded the ideal spectra to LRIS resolution and then added different amounts of noise. 
The solid curve is the LRIS spectra without noise, the dotted curve is with S/N = 20 and 
the dashed curve is with S/N = 10. The normalization factor is not specified here. 
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Cross-correlation functions of different noise distribution 

based on LRIS resolution spectra with S/N = 10 



Fig. 5.— Flux cross-correlation functions with different noise distribution.Each curve is the 
average of 300 cross-correlation functions while each cross-correlation function was derived 
from a paired QSO spectra based on our simulations. The corresponding FWHM is 300 
km/s, the pixel size is 130 krn/s and the signal-to-noise ratio is 10. The angular separation 
is 120”. The four solid curves are SCDM (the lowest solid curve), 11 a = 0.5 (the second 
lowest solid curve), Ha = 0.7 (the second highest solid curve) and Ha = 0.9 (highest solid 
curve) cosmological models with Gaussian noise. The long dashed curve is the Ha = 0.7 
cosmological model with Poisson noise distribution and the dotted curve is the Ha = 0.7 
cosmological model with Gamma noise distrubution. 
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Cross-correlation function of 2 pairs 



Fig. 7.— Flux cross-correlation functions of two QSO pairs from the O y \ = 0.7 simulation. 
The solid curve denotes the result of the first pair while the dashed curve denotes the result 
of the second pair. The error bars were derived from the flux spectra with S/N = 10 and 
S/N = 20. The angular separation is 120”. 
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probability distribution function 



- 0.01 0 0.01 0.02 

correlation 


Fig. 8.— PDF(£(Au|p) of cosmological models with different 12 a- The x-axis is the flux cross¬ 
correlation and the y-axis f(x) is the probability density function for x. The A'C|j equals to 
129 km/s. The pair separation is 120”. The solid curve is the SCDM model, the dotted curve 
is the 11 a = 0.5 model, the short-dashed curve is the Ha = 0.7 model and the long-dashed 
curve is the Ha = 0.9 model. We have fitted and normalized the original distribution curve 
to a normalized Gaussian distribution function in each cosmological model. 
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probability distrubution function 


v = 388 km/s 



Fig. 9.— PDF(£(Ay|p) of cosmological models with different Ha- The x-axis is the flux cross¬ 
correlation and the y-axis f(x) is the probability density function for x. The At/ equals to 
388 km/s. The pair separation is 120”. The solid curve is the SCDM model, the dotted curve 
is the Ha = 0.5 model, the short-dashed curve is the Ha = 0.7 model and the long-dashed 
curve is the Ha = 0.9 model. We have fitted and normalized the original distribution curve 
to a normalized Gaussian distribution function in each cosmological model. 
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probability distribution function 


v = 647 km/s 



Fig. 10.— PDF(^(Av|j r )) of cosmological models with different Ha- The x-axis is the flux 
cross-correlation and the y-axis f(x) is the probability density function for x. The At/ equals 
to 647 km/s. The pair separation is 120”. The solid curve is the SCDM model, the dotted 
curve is the 12 a = 0.5 model, the short-dashed curve is the 12 a = 0.7 model and the long- 
dashed curve is the 12 a = 0.9 model. We have fitted and normalized the original distribution 
curve to a normalized Gaussian distribution function in each cosmological model. 





The result of 30 pairs 
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Fig. 11.— The log(likclihood) functions at different velocity separations based on 30 QSO 
pairs. The functions at different velocity separations reveal similar results and the final 
combined result gives a better statistics. 
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cross-correlation functions in different cosmological models 
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Fig. 13.— We compare fluctuations in cross-correlation functions caused by the uncertainties 
of O/j and Oa- Some other important parameters of these five simulations are cr 8 = 0.73 and 
J = 1.0. When O s = 0.03, h=0.7967; when O s = 0.05, h = 0.617. 
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cross-correlation functions in different cosmological models 
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Fig. 14.— We compare fluctuations in cross-correlation functions caused by the uncertainties 
of J and Ha- Some other important parameters of these five simulations are Qb — 0.04, h 
= 0.69 and cr 8 =0.73 . 







probability distribution function 



- 0.01 0 0.01 0.02 

correlation 

Fig. 15.— The probability density functions at v = 388 km/s with SN = 10 and S/N = 20 
for different cosmological models. The x-axis denotes the values of cross-correlation and the 
y-axis represents the probability density function f(x) for x. The PDF of a lower S/N has 
larger standard deviation than that of a higher S/N. 
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simulations 

<?8 

J 

Db 

h 

Da 

Al 

0.73 

1.0 

0.04 

0.69 

0.0 (SCDM) 

A2 

0.73 

1.0 

0.04 

0.69 

0.5 

A3 

0.73 

1.0 

0.04 

0.69 

0.6 

A4, BO, CO, DO 

0.73 

1.0 

0.04 

0.69 

0.7 

A5 

0.73 

1.0 

0.04 

0.69 

0.8 

A6 

0.73 

1.0 

0.04 

0.69 

0.9 

B1 

0.657 

1.0 

0.04 

0.69 

0.7 

B2 

0.803 

1.0 

0.04 

0.69 

0.7 

Cl 

0.73 

0.25 

0.04 

0.69 

0.7 

C2 

0.73 

3.0 

0.04 

0.69 

0.7 

D1 

0.73 

1.0 

0.03 

0.7967 

0.7 

D2 

0.73 

1.0 

0.05 

0.617 

0.7 


Table 1: A list of cosmological parameters of our simulations. The varied parameter in Set 
A is 14 a, in Set B is a 8 , in Set C in J and in Set D is Vt B (equivalent to h). 


gp # D A P(e20%) P(eio%) P(o>%) 


10 

30 

0.6 ± 0.15 

56 % 

30 % 

15% 

15 

20 

0.6 ± 0.1 

64 % 

34 % 

17 % 

20 

15 

0.64± 0.1 

77% 

44 % 

23 % 

25 

12 

0.7 ± 0.1 

80 % 

48 % 

25 % 

30 

10 

0.7 ± 0.07 

94 % 

66 % 

36 % 


Table 2: Probability of getting A within 20%, 10% and 5% error. Each row in the table 
indicates one statistical study. The first column is the number of QSO pairs in one subgroup 
while the second column is the number of subgroups in each study. The third column denotes 
the result of 12 a- Finally, the fourth, fifth, and the sixth columns are the probability of getting 
0 A within 20%, 10% and 5% error. 
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A 


A. Maximum-Likelihood Estimation 

For a random sample x\, x 2 , . x n from a population with a probability density function 

(PDF) depending on a parameter 9, the function t = t(xi, x 2 , . x n ) is called a ’’statistic” if 

it does not depend on any other unknown parameters. Then the term ’’estimator” denotes a 
function, method or prescription used to find a value of an unknown parameter. In general, 
t is an estimator of the unknown 9. For a continuous or discrete population which has 

probability density function f(x | 9), the likelihood of n observations xi, x 2 , . x n for a 

specific 6 is given by 

L{x i, x 2 , ..., x n | 9) = II i =1 f(xi | 9) (Al) 

We can think of L{xi,x 2 , ...,x n \ 6) as a function of 9 and call it the ’’likelihood function”, 
denoted by L or L(x_ \ 9). To be more general, if we consider n independent experiments 
with the same physical parameter 9 from these n sets, x 2 , ..., x n , of observations, the 
corresponding likelihood functions of each set are LQzq | 9), L(x 2 \ 9),..., and L(x n j 9), 
respectively. Then the combined-likelihood of all observations is : 

L(x i,X2, ...,Xn\ 9) = n. ; 1 /l(Xji I 9)U i2 f 2 (x i2 I 9) . n in f n (x in \ 9) (A2) 

which is equivalent to 


log(L(xi,X 2 ,...,Xn | 9) = log(fi(xn \ 9))+ ^log(f 2 (x i2 \ 9)) +.+ ^ log(f n (x in j 9)) 

i2 in 

(A3) 

Maximizing the combined likelihood in Equation (A2) or its equivalent expression in 
Equation (A3) gives us an estimation of 9 denoted by 9 (Frodesen et al. 1978). In this work, 
the unknown parameter 9 is chosen to be Da- Each Xi is a data point in the corresponding 
cross-correlation function. The probability density function fi(xi \ 9) is the PDF of the 
cross-correlation at a velocity separation for a given Da while other parameters are fixed 
(Details are described in §4.3 and §4.4). 








B. Error Propagation 


Let E[f(x o)] denote the error of any given function f(x) at xq. As mentioned in §4.2, 
the mean flux / of a paired QSO spectra: 


i = YTi=iUi( z i) + h(zi)) 

7 2 n 


(Bl) 


where n is the number of points in a spectrum, fi(zi) and f 2 (zi), i=l, 2, ... n are the 
data points from the first and second flux spectra of a QSO pair. Then based on the error 
propagation formulas (Harrison, 2001) we have 


m 




2 n 


(B2) 


where E[j\ (z t )\ is the error of the point fi(zi) and E[f 2 (zi)] is the error of the point / 2 (^). 
For the overflux S f = we have E[/ — /] = \JE[f] 2 + E[f] 2 . So 


E(Sf) = E[^-jL ] = Sfy 




'S [/] 2 + B[/T , ,E[f \ 2 


(/-/)= 


/ 


(/-/) : 


+(_ r )! 


(B3) 


From §4.2, £(Au||) =< Sfi(zi)Sf 2 (zj) >, where the average is over all possible (i,j) 
permutations. We consquently get 


E[5f 1 (z i }5f 2 (zj) 


$ fl{ z i)3 fl{Zi) 


E\S fl (zj)\ ( E\S f 2 (zj)\ 


(B4) 


Therefore, 


£[£(Au,|)] 


J_ E\fifi( z i)5f 2 (zj)] 

N \ij $fl( Z i)$f2( Z j) 


(B5) 



